DETAILED ACTION

1. The text of those sections of Title 35, U.S. Code not included in this action can be found 
in a prior Office action. 

Response to Arguments 

2. Applicant's arguments filed 03/28/2006 have been fully considered but they are not 
persuasive. 

Rejection of Claims 1, 5, and 9-19
Applicant argues that Gong (6,418,411) fails to disclose or suggest determining an
identity of a speaker based, at least in part, on a user identifier, where the user identifier is one of 
a unique code entered at a beginning of a usage session, a telephone number, a terminal 
identifier, or an identifier based on a plurality of rules associated with a phone. Based on the 
new limitation added, neither Gong, Digalakis et al. (5,864,810), nor deVries (6,289,309) 
disclose or suggest the new limitation. However, Dragosh et al. (6,078,886) do teach wherein
the user identifier is based on the plurality of rules associated with the phone and at least one
of a time of the day or a day of a week (col. 5 lines 32-37 and col. 6 lines 29-35; and col. 8
lines 65-67; and col. 9 lines 1-4).

Rejection of Claims 2, 4, 6, 8 and 20 
Applicant argues that claims 2 and 4 depend, either directly or indirectly, on a base claim
that is patentable over Gong, Digalakis and deVries for at least the reasons provided with
respect to claim 1, and that Thrasher et al. (2002/0052742) fails to satisfy the deficiencies of
Gong, Digalakis and deVries. Examiner respectfully disagrees. Thrasher et al. teach a confidence
measure that detects word lattices that were improperly identified (page 3 paragraph 35 lines 2-6).

Rejection of Claims 3 and 7
Applicant argues that Thrasher and Waibel fail to satisfy the deficiencies of Gong, 
Digalakis and deVries. Examiner respectfully disagrees. Thrasher et al. teach the confidence
measure, and Waibel et al. teach repeating confidence scoring in which the score is compared to a
predetermined threshold to repair misrecognition of speech (col. 1, lines 9-12 and 56-59).

Claim Rejections - 35 USC §103 
3. Claims 1, 5, and 9-19 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Gong (6,418,411) in view of Digalakis et al. (5,864,810), in further view of DeVries
(6,289,309). 

As to claim 1 Gong teaches 

a method of dynamically re-configurable speech recognition comprising: 
determining an identity of a speaker based, at least in part, on a user identifier (col. 3 
lines 13-18) a terminal identifier (col. 3 lines 21-28; terminal identifier is the phone used to 
identify the speaker) 

repeatedly (continually) determining parameters of a background model based on
sampled information collected at periodic time interval (Fig. 2, 0.3 delay, col. 2 lines 35-45)
during a received voice request (incoming utterance) (produce an adapted model based on inputs
from on-line noise estimations (background adaptation) and one-time adaptation (transducer
model), incoming utterance, col. 1, lines 42, 59-63, col. 2, lines 44-50 and Fig. 1, elements 11 &
20).

determining parameters (noise sample and utterance) of a transducer model (microphone 
or speaker) (Fig. 2; col. 5 lines 24-25); 

adapting a speech recognition model based on user-specific transformations 
corresponding to the determined identity of the speaker (col. 3 lines 5-20) and on at least one of 
the background model (background noise) (Fig. 1 element 21 recognition, element 19 
background noise, and col. 2 lines 59-61 steps 4-5); 

Gong does not teach rescoring ASR. 

However, Digalakis et al. do teach 

re-scoring automatic speech recognition using the speech recognition model comprising: 
generating word lattices representative of speech utterances in the received voice request
(col. 11, lines 40-44); 

concatenating the word lattices into a single concatenated lattice (sentence hypothesis 
necessarily implies word lattices, col. 13, lines 45-46);

applying at least one language model (language model) to the single concatenated lattice 
in order to determine word lattice inter-relationships (col. 13, lines 38-46); and 

determining information in the received voice request based on the re-score results of the
speech recognition model (rescoring the N-best sentence hypothesis, col. 13, lines 45-46).

It would have been obvious to one of ordinary skill in the art at the time the invention
was made to modify Gong's method of speaker adaptation by re-scoring ASR that generates and
links words in order to improve recognition performance for non-native speakers of American
English, as taught by Digalakis et al., col. 13, lines 29-30.

Gong in view of Digalakis et al. do not explicitly teach adjusting the periodic time 
interval based on the determined changes in the sample. 

However, DeVries et al. do teach 

adjusting the periodic time interval based on the determined changes in the sampled 
information (col. 6 lines 10-24). 

Therefore, it would have been obvious to one of ordinary skill in the art at the time of the
invention to modify Gong in view of Digalakis et al. speaker identification because DeVries et
al. teach that doing so would produce a noise tracking system that determines the effective time
window in real time, so as to adapt to environmental changes in noise (DeVries, col. 6 lines 18-23).

Gong in view of Digalakis et al. do not explicitly teach user identifier being one of a 
terminal identifier. 

However, DeVries et al. do teach user identifier being one of a terminal identifier (col. 3 
lines 58-67; terminal identifier are numbers 0-9). 

Therefore, it would have been obvious to one of ordinary skill in the art at the time of the
invention to modify Gong in view of Digalakis et al. speaker identification because DeVries et
al. teach that doing so would reduce values to recognize predetermined command words; in a
mobile telephone environment, these may include the numbers 0-9 in order to control the
operation of the device (DeVries, col. 3 lines 58-67).



As to claim 5 Gong teaches

A system of dynamically re-configurable speech recognition comprising:
a background model estimation circuit for repeatedly determining a background model at 
a periodic time interval during a voice request based, at least in part, on estimated background 
parameters based on collected sampled information (background noise is recorded and estimated, 
col. 2, lines 43-44; and col. 5 lines 24-34; the background noise model is implemented via a 
microphone and/or transducer which necessarily has the circuit for repeat determination of 
background noise as is needed in a noisy car environment); 

a transducer model estimation circuit for determining a transducer model of the voice 
request based, at least in part, on estimated transducer parameters (col. 2, lines 35-44; and col. 5 
lines 20-34); 

a background model adaptation circuit and a transducer model adaptation circuit for 
determining an adapted speech recognition model based on a speech recognition model and at 
least one of the background model (col. 5 lines 5-10)

a lattice concatenation circuit that concatenates at least two speech lattices based on 
speech utterances in the received voice request into a single lattice (col. 5 lines 5-34; speech
recognition necessarily has a lattice link in order to determine the differences between speech 
and noise) 

Gong does not explicitly teach adapting the controller based on user identification. 
However, Digalakis et al. do teach 

a controller that applies at least one language model to the single concatenated lattice to
determine relationships between lattices (col. 11 lines 20-26 and col. 6 lines 10-24).

the controller is adapted to determine an identity of a speaker based, at least in part, on a
user identifier and to apply user-specific transformations, corresponding to the identity of the
speaker, to the speech recognition model, a terminal identifier (Fig. 1-2 and col. 3 lines 20-25 and
43-47; col. 3 lines 21-28; terminal identifier is the phone used to identify the speaker)

Therefore, it would have been obvious to one of ordinary skill in the art at the time of the
invention to modify the method of Gong's speaker identification into the system of Digalakis et
al. because Digalakis et al. teach that doing so would improve performance and robustness of a
speech recognition system that is adapted to the speaker, and to the channel and the task
(Digalakis, col. 2 lines 24-26).

Gong in view of Digalakis et al. does not explicitly teach adjusting the periodic time 
interval based on the determined changes in the sample. 
However, DeVries et al. do teach 

adjusting the periodic time interval based on the determined changes in the sampled 
information (col. 6 lines 10-24). 

Therefore, it would have been obvious to one of ordinary skill in the art at the time of the
invention to modify Gong in view of Digalakis et al. speaker identification because DeVries et
al. teach that doing so would produce a noise tracking system that determines the effective time
window in real time, so as to adapt to environmental changes in noise (DeVries, col. 6 lines 18-23).

Gong in view of Digalakis et al. do not explicitly teach user identifier being one of a 
terminal identifier. 

However, DeVries et al. do teach wherein the user identifier being one of a terminal
identifier (col. 3 lines 58-67; terminal identifier are numbers 0-9).

Therefore, it would have been obvious to one of ordinary skill in the art at the time of the
invention to modify Gong in view of Digalakis et al. speaker identification because DeVries et
al. teach that doing so would reduce values to recognize predetermined command words; in a
mobile telephone environment, these may include the numbers 0-9 in order to control the
operation of the device (DeVries, col. 3 lines 58-67).

Claim 9 is directed toward a computer program with a computer readable program
code to implement or execute the method of claim 1, and is similar in scope and content to claim
1; therefore, claim 9 is rejected under similar rationale.

As to claim 10, which depends on claim 9, Gong teaches 

instructions for periodically determining a new transducer model (col. 5 lines 24-34).

As to claim 11, which depends on claim 10, Gong teaches

the parameters of the background model are determined based on a first sample period
(Fig. 2; sample period for background noise is determined before speech utterance)

the parameters of the transducer model are determined based on a second sample period 
(col. 5 lines 20-30; sample period for transducer model takes place during one-time adaptation 
(calibration), which takes place before on-line adaptation and thus requires a second, distinct 
sampling) 

As to claim 12, which depends on claim 10, Gong teaches

instructions for saving at least one of the background model (background noise is
recorded and estimated, col. 2 lines 43-44 and col. 5 lines 24-34).

Claim 13 is directed toward a computer readable storage medium with a computer readable
program code to implement or execute the method of claim 1, and is similar in scope and content 
of claim 1, therefore, claim 13 is rejected under similar rationale. 

Claim 14 is directed toward a computer readable storage medium with a computer 
readable program code to implement or execute the method of claim 1, and is similar in scope 
and content of claim 1, therefore, claim 14 is rejected under similar rationale. 

As to claim 15, which depends on claim 1, Gong teaches

repeatedly determining the parameters of the transducer model (col. 5 lines 28-34).

As to claim 16, which depends on claim 5, Gong teaches

the transducer model estimation circuit (necessary circuit in recognizer, col. 5 lines 24-32 
and col. 1 line 15 and 31-34) is configured to repeatedly determine the transducer model at the 
periodic time interval (Fig. 2 0.3 delay, col. 2 lines 35-45). 

As to claim 17, which depends on claim 13, Gong teaches 

repeatedly determining the parameters of the transducer model (col. 5 lines 25-34).

As to claim 18, which depends on claim 14, Gong teaches

determining the parameters of the transducer model (col. 5 lines 28-34).

Gong in view of Digalakis et al. does not explicitly teach adjusting the periodic time 
interval based, at least in part, on the collected first sampled information. 
However, DeVries et al. do teach 

adjusting the periodic time interval at least in part, on the collected first sampled 
information (col. 6 lines 10-24; DeVries et al. would necessarily use the first sampled 
information in a real-time application in order to readily determine the noise level changes which 
are analyzed using the forgetting factor in order to readily adapt to the changes in noise level). 

Therefore, it would have been obvious to one of ordinary skill in the art at the time of the
invention to modify Gong's speech in view of Digalakis et al. speaker identification because an
artisan of ordinary skill in the art would produce a noise tracking system that determines the
effective time window in real time, so as to optimally predict the noise power for the next frame
because in an automobile environment, passing cars or the shifting of gears may introduce
short-term non-stationary noise (DeVries, col. 6 lines 18-23).

As to claim 19, which depends on claim 19, Gong teaches 

interval of sample (Fig. 2). 

Gong in view of Digalakis et al. does not explicitly teach adjusting the length of the 
intervals. 

However, DeVries et al. do teach 

adjusting the length of the first periodic intervals based, at least in part, on a frequency
(amplitude-frequency product, energy, room noise and speech, noise update speech frame,
forgetting factor predict noise power) of determined changes successively sampled ones of the
first sampled information (adapt real time, forgetting factor, to predict noise power for the next
frame, col. 8 lines 2-6, 21-24; col. 5 lines 48-51, col. 6 lines 2-4, 10-11, 20-23).

Therefore, it would have been obvious to one of ordinary skill in the art at the time of the 
invention to modify Gong in view of Digalakis et al. speaker identification because an artisan of 
ordinary skill in the art would adjust interval of frequency samples, so as to optimally predict the 
noise power for the next frame. (DeVries, col. 6 lines 18-23). 

4. Claims 22, 24, and 26 are rejected under 35 U.S.C. 103(a) as being unpatentable over
Gong (6,418,411), in view of Digalakis et al. (5,864,810) in further view of DeVries (6,289,309),
as applied to claim 1, and in further view of Dragosh et al. (6,078,886).

As to claims 22, 24 and 26, which depend on claims 1, 5 and 14, Gong teaches 

user identifier (col. 2 lines 11-26 and Fig. 2).

Gong in view of Digalakis et al. in further view of DeVries do not explicitly teach
the user identifier is based on the plurality of rules associated with the phone and at least one
of a time of the day or a day of a week.

However, Dragosh et al. do teach wherein the user identifier is based on the plurality of rules
associated with the phone and at least one of a time of the day or a day of a week (col. 5 lines
32-37 and col. 6 lines 29-35; and col. 8 lines 65-67; and col. 9 lines 1-4).

Therefore, it would have been obvious to one of ordinary skill in the art at the time the
invention was made to modify Gong in view of Digalakis et al. in further view of DeVries et al.
to identify rules associated with the phone and at least one time of the week, because an artisan
of ordinary skill in the art would use grammar rules for the purpose of recognizing words and to
initiate a speech recognition task (Dragosh et al. col. 6 lines 28-34).

5. Claims 2, 4, 6, 8, and 20 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Gong (6,418,411), in view of Digalakis et al. (5,864,810) in further view of DeVries (6,289,309),
as applied to claim 1, and in further view of Thrasher et al. (2002/0052742). 

As to claim 2, which depends on claim 1, Gong teaches

speech recognition modeling (Fig. 1 element 21). 

Gong in view of Digalakis et al. in further view of DeVries do not explicitly teach 
confidence score to generate word lattices. 
However, Thrasher et al. do teach 

generating a confidence score (confidence measure, page 3 paragraph 35 lines 2-6, Fig. 2, 
element 110) to determine whether the generated word lattices (page 3 paragraph 35 lines 2-6)
are acceptable (identifiers indicating which patterns may have been improperly identified, page 3 
paragraph 35 lines 2-6; paragraph 36 lines 7-8; acoustical score that measures the "acceptability" 
of word lattices). 

Therefore, it would have been obvious to one of ordinary skill in the art at the time the
invention was made to modify Gong in view of Digalakis et al. in further view of DeVries et al.'s
noise speech enhancement such that it generates a confidence score, because an artisan of
ordinary skill in the art would identify proper patterns that would provide an accurate recognizer
(Thrasher et al., col. 3, paragraph 0035 lines 5-8).

As to claim 4, which depends on claim 2, Gong teaches

saving at least one of the parameters of the background model and the transducer model 
(background noise is recorded and estimated, col. 2, lines 43-44; and col. 5 lines 24-34). 

As to claim 6, which depends on claim 5, Gong teaches 
speech recognition modeling (Fig. 1 element 21). 

Gong in view of Digalakis et al. in further view of DeVries do not teach confidence 
score to determine lattices. 

However, Thrasher et al. do teach 

generating a confidence score (confidence measure, page 3 paragraph 35 lines 2-6) after 
applying speech recognition model (language model, Fig. 2, element 110) to determine whether
the lattices (page 3 paragraph 36 lines 7-8) are acceptable (identifiers indicating which patterns 
may have been improperly identified, col. 3, paragraphs 0035 lines 2-6; acoustical score that 
measures the "acceptability" of word lattices). 

Therefore, it would have been obvious to one of ordinary skill in the art at the time of the 
invention to modify Gong in view of Digalakis et al. in further view of DeVries et al.'s noise 
speech enhancement because an artisan of ordinary skill in the art would generate a confidence 
score to avoid poor recognition quality. (Thrasher et al., col. 3, paragraph 0038 lines 5-8). 

As to claim 8, which depends on claim 6, Gong teaches 

saving at least one of the parameters of the background model and the transducer model
(background noise is recorded and estimated, col. 2, lines 43-44; and col. 5 lines 24-34).

determining the adaptation speech recognition model (adaptation of HMM for speaker
and acoustic environment, col. 1, lines 38-40) based on at least one of the background model
(background model is determined based on the samples taken during the sample period, col. 2
lines 43-45 & element 18, Fig. 1).

As to claim 20, which depends on claim 14, Gong teaches speech recognition (Fig. 1). 
Gong in view of Digalakis et al. do not explicitly teach confidence scoring. 
However, Thrasher et al. do teach 

generating a confidence score after applying the speech recognition model to determine 
whether the generated word lattices are acceptable (confidence measure based on probable 
sequences provided as a result of lattice, lattice have a lexical word, in recognized speech and 
acoustic score, page 3 paragraphs 34-36); 

comparing the confidence score to a predetermined value (page 3 paragraphs 32 and 35-36
and page 4 paragraph 40; user predetermines the value of the confidence score via listening to
the results, user does comparison); and 

repeating automatic speech recognition (re-launch) of the received voice request based, at 
least in part, on a result of the comparing of the confidence score with the predetermined value 
(edit recognition of speech, user re-launches application, reinitializes hypothesis, page 4 
paragraph 40; user edits to reinitialize hypothesis if there is a problem with confidence score and 
the predetermined value). 

It would have been obvious to one of ordinary skill in the art at the time of the invention 
to modify Gong in view of Digalakis et al. in further view of DeVries et al.'s speech recognition
model to produce Thrasher et al.'s N-best alternatives in speech recognition because an artisan of
ordinary skill in the art would produce an engine that is never considering more than a 
predetermined maximum number of sub-paths, allowing for quicker processing (Thrasher et al., 
page 6 paragraph 63 lines 1-2). 

6. Claims 3 and 7 are rejected under 35 U.S.C. 103(a) as being unpatentable over Gong
(6,418,411), in view of Digalakis et al. (5,864,810) and DeVries (6,289,309), in view of
Thrasher et al. (2002/0052742), as applied to claims 2 and 6, and in further view of Waibel et al.
(5,712,957).

As to claim 3, which depends on claim 2, Gong teaches

the parameters of the background model are determined based on a first sample period 
(sample period for background noise is determined before speech utterance, Fig. 2); 

the parameters of the transducer model are determined based on a second sample period 
(sample period for transducer model takes place during one-time adaptation (calibration), which 
takes place before on-line adaptation and thus requires a second, distinct sampling, col. 5, lines 
23-28) 

Gong in view of Digalakis et al. and in further view of DeVries do not teach comparing
confidence scores to determine whether to perform the ASR process again.
However, Waibel et al. do teach 

the confidence score is compared to a predetermined value (threshold value) in order to
determine whether to perform the automatic speech recognition process again (repeat again, col.
1, lines 56-59).

Therefore, it would have been obvious to one of ordinary skill in the art at the time the
invention was made to modify Gong in combination with the speech recognition systems of
Digalakis et al. and DeVries into Thrasher's method so that the confidence score is compared to
a predetermined threshold value to repair misrecognition of speech (Waibel col. 1 lines 9-12).

Claim 7 is directed toward a system with a controller to implement or execute the method 
of claim 3, and is similar in scope and content of claim 3, therefore, claim 7 is rejected under 
similar rationale. 

7. Claims 21, 23 and 25 are rejected under 35 U.S.C. 103(a) as being unpatentable over Gong
(6,418,411), in view of Digalakis et al. (5,864,810) in further view of DeVries (6,289,309), as
applied to claims 1, 5 and 14, and in further view of Comerford et al. (6,107,935).

As to claim 21, which depends on claim 1, Gong teaches 

user identification (col. 3 lines 5-20) 

Gong in view of Digalakis et al. in further view of DeVries do not explicitly teach the 
identifier comprises a calling phone number. 
However, Comerford et al. do teach 

wherein the user identifier comprises a calling phone number (col. 11 lines 64-67 and col.
12 lines 1-20).

Therefore, it would have been obvious to one of ordinary skill in the art at the time of the
invention to implement Comerford et al.'s calling phone identifier into the method of Gong in
view of Digalakis et al. in further view of DeVries because an artisan of ordinary skill in the art
would have allowed for successfully verified calls; when the requesting speaker is not verified,
the name and number is flagged and saved, but not placed (Comerford et al. col. 11 lines 64-67
and col. 12 lines 1-20).

Claim 23 is directed toward a system with a controller to implement or execute the 
method of claim 3, and is similar in scope and content of claim 3, therefore, claim 23 is rejected 
under similar rationale. 

Claim 25 is directed toward a method to implement or execute the method of claim 3, and 
is similar in scope and content of claim 3, therefore, claim 25 is rejected under similar rationale. 

Conclusion 

8. THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy
as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within TWO 
MONTHS of the mailing date of this final action and the advisory action is not mailed until after 
the end of the THREE-MONTH shortened statutory period, then the shortened statutory period 
will expire on the date the advisory action is mailed, and any extension fee pursuant to 37
CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event,
however, will the statutory period for reply expire later than SIX MONTHS from the mailing 
date of this final action.

9. Any inquiry concerning this communication or earlier communications from
the examiner should be directed to Myriam Pierre whose telephone number is 571-272-7611.
The examiner can normally be reached on 8:30-5:30.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Richemond Dorvil can be reached on 571-272-7602. The fax phone number for the 
organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the
Patent Application Information Retrieval (PAIR) system. Status information for published 
applications may be obtained from either Private PAIR or Public PAIR. Status information for 
unpublished applications is available through Private PAIR only. For more information about the 
PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private 
PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). 
MP 06/01/06 




